perm filename PRIMER.RLL[RLL,DBL] blob sn#605137 filedate 1981-08-10 generic text, type C, neo UTF8
	RLL PRIMER

This document is designed to help a novice user ease into the RLL system.
The first section provides an introduction to the RLL philosophy -- this
description is intentionally sparse; the interested reader is referred to
[Greiner] for more directed propaganda.  Section 2 tells the "dirt" --
everything necessary to actually start up a new version of RLL.
It includes many boring but necessary
details -- such as the names of the various files, and where each is stored.
(Note that the appendix-memo, [Greiner&Lenat], refers to these files; here we
get a bit more explicit.)
Section 3 contains two annotated dribble files of actual RLL sessions -
the first is a short demonstration which can be given; while the second
encodes the solution to part of a more elaborate programming task.
Appendix A provides an outline for how to import the RLL system to a
new host machine.

Section 1:  Philosophy

The goal of a representation language language is to store facts about
the representation being used as well as facts about the domain being
modelled.  The core idea is that one can use a reasoning system to
understand and reason about representation-related issues -- i.e. that
it is meaningful to talk about the domain of representation.

	"To everything a unit"
In this system, we use the same basic representation to encode all facts -- 
both representation-related and domain dependent.
Everything (we may ever want to reason about) is encoded as a unit,
whose slots store relevant facts about that entity.
There is a unit, for example, associated with every type of slot.
The fact that
the Inverse of Isa is Examples is stored in the Isa unit, for example,
as is the fact that the value of U:Isa, for any unit U, is a set of
units which each represent a class of objects.
Such a unit may also store an algorithm for computing this U:S value.
(E.g. one such procedure would first see if there was some value physically
stored as the value of the S slot of the U unit, and if not,
examine each prototype, P, of this U unit, returning the first P:S value
it finds.)
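
The lookup procedure just described can be sketched as follows.  This is a
Python model, not RLL code: units are plain dictionaries, and the slot name
"Prototypes" and the example units are assumptions made for illustration.
Whether RLL looks up P:S recursively is not stated here; the sketch checks only
values physically stored on each prototype.

```python
# Hypothetical model: `units` maps a unit name to a dict of its slots.
# "Prototypes" is an assumed slot name holding the prototypes of a unit.

def get_value(units, u, s):
    """Return U:S: the stored value if present, else the first prototype's value."""
    unit = units[u]
    if s in unit:                          # a value is physically stored on U
        return unit[s]
    for p in unit.get("Prototypes", []):   # otherwise examine each prototype P
        if s in units[p]:
            return units[p][s]             # the first P:S value found
    return None                            # nothing stored anywhere


units = {
    "TypicalDog": {"Legs": 4},
    "Fido": {"Prototypes": ["TypicalDog"], "Owner": "RG"},
}
print(get_value(units, "Fido", "Legs"))
```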

There are similarly units which store facts about different modes of
inheritance (eg there is one inheritance associated with the Isa relation, and
another with the SuperClass relation,) or control structure, or ...
Many LISP functions, those which RLL must reason about, have a unit
associated with them. (Examples include MapUnion, UNION, CAR ...)

There are two major sub-themes on this "a unit for everything" topic:
explicitness and proper placement.
Every component of an ideal RLL-based system should explicitly
state what it "means", under what circumstances this component
is appropriate, how it is to be used and under what type of interpreter,
and any other fact which might help you (or some subsequent
user) to understand this morsel -- clearly a very (overly?) ambitious task.
By encoding each such part as a unit, the user is allowed to specify 
(or partially specify) this component at several levels, in many orthogonal ways.
It is still, of course, up to the user to create the appropriate types of slots,
and fill them with the correct type of data.
All the RLL structure can provide is this frame-based formalism;
and documents (like this) which strongly encourage storing such information.

We further advocate creating units to house all the facts about
a given concept.
For example, there are many facts one may want to say about a format such
as "Set" -- for example, a set contains no duplicate elements,
and  the order of the elements is inconsequential.
More procedural information may include a piece of LISP code which tells how
to add a new element to an existing set (for example, to do nothing if that
element is already present; otherwise to CONS it onto the front of the list
representing the set), or how to verify that some list in fact represents a set
(only if it contains no repeated elements).
(Other facts include a list of ways to encode a set 
-- as a bit vector, linked list, or tree, ...;
and how to convert a bag into the corresponding set, under each of the possible
representational schemes, ...)
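
As a rough illustration, the procedural facts listed above for the "Set"
format might look like this when the set is encoded as a list.  This is a
Python sketch rather than the LISP code RLL would store; the function names are
hypothetical, and only the linked-list encoding is shown.

```python
def set_add(elements, x):
    """Add x to a set stored as a list: do nothing if x is already present,
    otherwise put it on the front (the analogue of CONSing it on)."""
    if x in elements:
        return elements
    return [x] + elements


def is_set(elements):
    """A list represents a set only if it contains no repeated elements."""
    seen = []
    for e in elements:
        if e in seen:
            return False
        seen.append(e)
    return True


def bag_to_set(bag):
    """Convert a bag (a list that may repeat elements) into the corresponding
    set, keeping the first occurrence of each element."""
    result = []
    for e in bag:
        if e not in result:
            result.append(e)
    return result
```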

There are often several alternate ways of storing such information.
One all-too-common method is to put some subset of the data at
each location which uses that fact,
in effect
smearing the relevant facts around the data base.
(Here, this would involve building
the set-addition operation into the adding value function, and the
convert from bag to set process into some general format-conversion process.)
This approach does lead to several complications 
-- most apparent when the formalization changes, and the user is left
with only disjoint clumps of code,
devoid of the needed fact that these were SET-related facts.

For this and other reasons, 
<ftnote>[such as ease of adding in new, related things -- such as the new format,
OrderedSet]
it makes sense to store all SET-related facts in one place.
Each application would explicitly point here when it wanted to determine some
properties of SETs.
To derive the code it may need, each application would then set the appropriate
deductive mechanism on this unit; reasoning from the facts stored there.

Section 2: Dirt

	ENVIRONMENT

RLL's current host machine is Rand-Ai, a DEC20 machine, running a modified
TOPS20 operating system.
RLL itself is built on InterLisp, and is (currently)
strongly dependent on various of IL's bells and whistles.
The user may pretend all the units he needs are currently "in core" --
that is, whenever he wants to see a unit, it is available for examination;
and any change he may make to that unit is preserved.
This "demand unit swapping" facility is provided by CORLL,
a module which extends the HASH package which comes with InterLisp.
(See [Smith] for additional details.)
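
The idea of "demand unit swapping" can be sketched roughly as below.  This is
a Python model under stated assumptions, not a description of CORLL itself:
the class name, the fixed capacity, and the least-recently-used bumping policy
are all hypothetical, and a dictionary stands in for the external hash file.

```python
from collections import OrderedDict


class UnitStore:
    """Keeps at most `capacity` units in core; the rest live in a backing store.
    Assumes capacity >= 1."""

    def __init__(self, backing, capacity=2):
        self.backing = backing        # stands in for the external hash file
        self.capacity = capacity
        self.incore = OrderedDict()   # unit name -> slots, least recently used first

    def get(self, name):
        if name in self.incore:
            self.incore.move_to_end(name)               # mark as recently used
        else:
            self.incore[name] = dict(self.backing[name])  # read the unit in on demand
            self._bump_if_needed()
        return self.incore[name]

    def _bump_if_needed(self):
        # Bump the least recently used units, writing their current slots out
        # so that any change made while in core is preserved.
        while len(self.incore) > self.capacity:
            old, slots = self.incore.popitem(last=False)
            self.backing[old] = slots
```

The caller only ever asks for a unit by name; whether it was already in core
or had to be read in is invisible to him, which is the pretence described above.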

	WHERE IT IS

Nomenclature: *.. files are LISP source code, *.COM are the compiled versions
of those files (based on the non-extension part of the file-name), *.KB are
the hash-files CORLL uses to store the units, associated with the Knowledge
Base of that name, (*.PAGE files are the hash-files used during a specific
session,) TRACE*.* are the dribblefiles -- preserving a record of the current
session (an example name is TRACEB.MAY23; which is the 2nd (ie Bth) editing
performed on 23-May).

The latest version of all the files and KBs used for RLL are in <GREINER.RLL>.
This includes
	UTIL.COM - all the miscellaneous accessing functions and 
global variables, together with a body of (to RG) useful macros and nice
debugging aids.
	DAVE.COM - UTIL itself loads this in.  This stores accessing functions
which are one step up (ie more specific to our applications) from the primitive
CORLL functions.
	AUX.COM, ? ...  -- other files optionally used by UTIL.
Ignore them for now. (Trust me...)
	RLL.COM - associated with each KB is a file which houses the non-unit
things (eg functions, variables, and who-knows-what-else)
which pertain to that KB.  This is the body of RLL-specific such things.
	
The nucleus for the running system consists of the units stored in the 
following KBs:
	RLL.KB - these units hold the basic hierarchy of the units --
indicating what is a SuperClass of what.  It also holds the TypicalExamples
of each class.
	SLOTS.KB - here are the slots, together with things pertinent only
to slots -- in particular, this part of the hierarchy.  (This may be a bad
decision -- in any event, subject to change.)
	USERS.KB - facts about users, and classes of users, are here.
	LISPFNS.KB - many LISP functions have been dignified with a unit --
in which a variety of facts about that function can be stored; RLL uses
these facts when it has to reason about that function.

NB:  All 4 of these KBs must be present for RLL to "work".  They are all
loaded "at once" -- i.e. loading RLL causes the other 3 to be opened.

There is also a body of indirectly relevant files which reside in
<GREINER.CORLL>.  As the name implies, these files are used to build the
CORLL sysout -- we will soon see that RLL is created on top of this.
Note: the original versions of these files are kept on the [SOCRE] machine,
on <CSD.RLL>.  Hence a given file's internal name will often NOT match
its external name -- the handle Rand-Ai and IL use to refer to this file.
(Also: for efficiency several of these files are SYSLOADed, rather than LOADed.
As a side effect, IL does not notice these files, making it very difficult
to retrieve and edit these functions -- which is just as well, as 
you shouldn't be anyway...)

These CORLL files, plus a few files on <LISP> or <LISPUSERS>, are
explicitly listed in Appendix A, which discusses how to transfer over
a new RLL system.

	HOW TO ACTUALLY START UP

	All the following assumes you are about to use RLL on Rand-Ai.
Appendix A tells what should be done to use RLL on some other host machine.
(Such steps would be taken before typing any of the following commands,
of course.)  Anyway, once everything is present, type

@<GREINER.CORLL>CORLL.EXE <cr>
	to the top level.  This places you in a version of LISP, in which
	the various (hopefully most recent) CORLL files have been loaded.
	Note: if that EXE file does not exist, contact GREINER@Rand-Ai, who
	will strive to produce a new one.
	Then type
3←LOAD(<GREINER.RLL>UTIL.COM]
	This will load in the various functions ... needed to run RLL.
	It may also load in other nice files.
	You are now ready to do the actual RLL stuff.
4←START]
	This will first open the nucleus RLL kbs (mentioned above).
	In addition to providing access to the units of these KBs, this opening
	process will load in the relevant associated source files.
	START will then ask if you want any additional KB.  Responding Y
	allows you to select which of the existing KBs to open now.
	(At this writing, there are about 12 such KBs.)
	Note loading a KB will insist any "superior" KBs (ie those Knowledge
	Bases on which this KB depends) be present first.  Hence additional
	KBs may be loaded here as well.
	[Note: this will use certain OpenningOptions, described elsewhere.
	(No, I don't know where either.)]
	 -- this will do what it can to keep the hash files from growing too
		fast; by compacting them if ever the ratio of number of units
		to file size falls below some threshold -- MIN-RATION = 2 initially.
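
The rule that a KB's "superior" KBs must be present before the KB itself
amounts to a dependency-ordered traversal.  A minimal Python sketch follows,
assuming the dependencies are available as a table from each KB to its
superiors; the function name and the KB names in the usage note are
hypothetical, and the dependency structure is assumed to have no cycles.

```python
def opening_order(kb, superiors, opened=None):
    """Return the KBs to open for `kb`, each superior KB ahead of its dependents.

    `superiors` maps a KB name to the list of KBs it depends on.
    Assumes the dependency structure contains no cycles.
    """
    if opened is None:
        opened = []
    if kb in opened:                 # already scheduled; nothing more to do
        return opened
    for sup in superiors.get(kb, []):
        opening_order(sup, superiors, opened)   # superiors go in first
    opened.append(kb)
    return opened
```

For instance, with the made-up table {"TOP": ["MID", "BASE"], "MID": ["BASE"]},
asking for TOP yields BASE, then MID, then TOP.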

.....  get to work  .....

	If you work long enough, CORLL will decide the time has come to
	bump some units -- which involves writing onto an external file.
	At this random point you will get a message, asking whether such
	writes are to be denied.  In general just say N here.  (We'll
	return to this point later.)

	When you want to finish this session, you have 3 basic options.
	(i) Nothing you did needs to be saved.  Just ↑C out, delete the *.PAGE
	files cluttering up your directory 
	(these are remnants of the *.KB files which
	you have been editing), and go on your merry way.
	Please be sure you REALLY want to do this -- nothing will be preserved
	of that entire session, honestly!
	[Note (LOGOUT) will not work here: it has been advised to keep users
	from leaving when any of their KB files are still open.]

	(ii) There are more things you want to keep doing; but you don't want
	to do it just now.  To preserve the state of the world now, type
92←SOS Fred
	Note the absence of parentheses -- this SOS is a LISPXMACRO, which 
	would just get confused by such things.  This actually calls the 
	function RLL-SYSOUT, with certain parameters which I liked.
	(In particular, this setting makes SYSOUTing fairly fast; most
	of the work done for safety is performed when reading in this SYSOUT.)
	Note:  This creates a sysout named Fred.  Any name would work, of course.
	As this will require between 350 and 400 pages, it may make sense to do
	things like EXPunging beforehand; or to create a <BIG-DIRECTORY>Fred
	sysout, rather than Fred.  Finally, this actually dumps you back at
	top level.
	We will discuss some nuances of reading in this sysout below.

	(iii) The final option involves writing all the KBs out.  This should
	be done when a number of changes have been made, or when you simply
	feel nervous not having a complete set of transportable KBs.
	Here you type
92←CC]
	This function steps through the list of now open KBs, in an order
	which reflects the dependency structure, mentioned above.  For each
	KB, you have the option of either CANCELing the KB, which throws out
	all of your changes SINCE OPENING IT (which may have been several
	SYSOUTs earlier!) or WRITEing it -- which preserves the full KB.
	[As with opening the KBs, you have various options for additional
	things to do when you close them.  These will have to be described
	somewhere, eventually...]

	When this returns, you may want to MAKEFILE any of the files which have
	been changed (unless that has already been performed, by the writing
	process).
	[The LISPXMACRO CLOSEUP writes all the files, using the options stored
	in your user-model.]

------
	After using the SOS command, you may eventually wish to return to this
	RLL image.
	Typing 
@Fred.EXE <cr>
	to the exec will enter this image, and proceed to do some re-initialization
	work.  After (considering) reopening the dribble and trace file, the 
	RLL-AFTERSYSOUT function will copy all the (ReadWrite) KBs from 
	*.PAGE to ___*.Page-Fred,
	where ___ will be the value of the global variable BACKUPDIR 
	(defaulting to "" -- in general this will be something like "<3SCRATCH>").

	Note these *.Page-Fred files are associated with the FRED.EXE sysout.
	If either you or your host system do something which you regret,
	(eg delete the wrong unit, or crash, respectively,) all is not lost:
	All you need to do is rename from ___*.Page-Fred to *.PAGE 
	(make sure you use the same extension number the original page file had!)
	and reenter the Fred sysout.  (Note: maybe I'll simplify the task by
	having the reopening algorithm check first for the exact file; then
	for those ___*.PAGE-FRED things -- this way the user would need only
	delete the existing *.PAGE files.  Later...)
	Anyway, you'll now be exactly where you were at the start of this session --
	none of the KB modifications performed during that aborted session will be 
	preserved.
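
The *.PAGE to ___*.Page-Fred naming can be made concrete with a small sketch.
This is a Python model of the naming rule only, not of RLL-AFTERSYSOUT; the
function name is hypothetical, and the extension (generation) number mentioned
above is deliberately ignored.

```python
def backup_name(page_file, sysout, backupdir=""):
    """Map a page file such as 'RLL.PAGE' to its backup name for a given sysout,
    e.g. '<3SCRATCH>RLL.Page-Fred'.  `backupdir` plays the role of BACKUPDIR."""
    stem, ext = page_file.rsplit(".", 1)
    if ext.upper() != "PAGE":
        raise ValueError("not a *.PAGE file: " + page_file)
    return backupdir + stem + ".Page-" + sysout


print(backup_name("RLL.PAGE", "Fred", "<3SCRATCH>"))
```

Recovering from a mishap is then the reverse renaming, from the backup name to
the original *.PAGE name, before reentering the sysout.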

	Final note: the *.PAGE to ___*.Page-Fred process uses the TENEX function --
	which basically creates an inferior fork, does the work (here a COPY)
	and (in an error-free execution) POPs back up.  
	However, if the desired *.PAGE file isn't found, or if this new file
	causes your allocation quota to be exceeded, or ..., TENEX just stops --
	leaving you in the dreaded inferior fork.  At this point, your best move
	is to type POP, to return to LISP, then quickly 
	<control> something-or-other.  At this point that last COPY did not work,
	and you'll have to pull some shenanigans to get this sysout to work.
	Here's where your years of programming experience should pay off --
	as you're on your own.

	General Conventions

First, there are many places when yes/no questions are asked.  Typing Y or y
is sufficient for an affirmative response; N (n) for negative.  No other 
response is allowed.
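
The yes/no convention is simple enough to state as code.  A minimal Python
sketch, with a hypothetical function name; stripping surrounding whitespace is
an assumption, not something the convention above specifies.

```python
def parse_yes_no(response):
    """Return True for Y or y, False for N or n, and None for anything else
    (which the caller should reject and ask again)."""
    answer = response.strip()
    if answer in ("Y", "y"):
        return True
    if answer in ("N", "n"):
        return False
    return None
```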

With respect to names:
	In addition to those names LISP enforces (and/or suggests) [eg ___FNS
is the name of a variable whose value is a list of functions], there is a system
we have adopted for naming units:
  Any___ refers to the class of all ___s (as in AnyDog, or AnyThing)
  Typical___ refers to the typical member of that ___ class -- this unit does
	NOT represent any "real world" entity; only a compilation of standard
	sorts of facts.
  F___ refers to a format - as in FSet or FOneOf.
  I___ refers to an inheritance - as in IExamples.
  My___ refers to the syntactic slot which holds the ___ slot of the meta-unit 
	representing this unit.  (eg MyCreator)
  All___s refers to the slot which extends the ___ slot (eg AllIsas extends Isa).
  ___* refers to the transitive closure of the ___ slot (eg SubClass*).
  To___ refers to slots which store some function -- the function which 
	performs the ___ action -- consider ToGetValue.
  FnFor___ing refers to a slot whose value is a function, which helps perform
	the ___ action.
{see [Greiner] for yet more examples of these unit naming conventions.}
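
The unit naming conventions above can be summarized as a table of patterns.
The Python sketch below only builds names from a stem; the table keys and the
function name are labels invented here for illustration, and the stem "Print"
in the usage note is a made-up example.

```python
# Each pattern follows one convention listed above; "{}" marks the ___ part.
NAME_PATTERNS = {
    "class": "Any{}",               # AnyDog
    "typical": "Typical{}",         # typical member of the ___ class
    "format": "F{}",                # FSet
    "inheritance": "I{}",           # IExamples
    "meta_slot": "My{}",            # MyCreator
    "extended_slot": "All{}s",      # AllIsas
    "closure": "{}*",               # SubClass*
    "action_function": "To{}",      # ToGetValue
    "helper_function": "FnFor{}ing",
}


def unit_name(kind, stem):
    """Build a unit name of the given kind from a stem."""
    return NAME_PATTERNS[kind].format(stem)


print(unit_name("extended_slot", "Isa"))
```

Going the other way (guessing the kind from a name) is not reliable from the
prefix alone: Isa begins with I yet names a slot, not an inheritance, and
FnFor___ing names begin with F without being formats.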

for functions:
	Those beginning with "Default" (as in DefaultGetValue) are usually
values of some slot, generally on the Typical___ unit.  For example,
DefaultPutValue = TypicalSlot:ToPutValue.  When I get around to unitizing
these functions, such information will be stored on the associated unit
(in terms of, say, a back pointer).

for variables:
	All___ is the set of all ___ -- consider AllFNS.

	BELLS and WHISTLES

RLL maintains a dribble file throughout each session.  For all practical
purposes, every character which appears on your screen, whether typed by
you or some LISP function, is preserved; and can be re-examined by simply
reading this file.  Unfortunately this dribble file is not closed
if the host system crashes; and so its contents are not preserved in such
cases.  As these situations reflect one of the major reasons for having
this dribble file in the first place, we take great pains to keep this
file as complete as we can.  This is achieved by periodically closing,
and reopening, the current dribble file.  The user is informed that
this has happened by a
  *** Am now ...
message, each time this reopening process occurs.
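
The close-and-reopen trick can be sketched as follows.  This is a Python
model, not the RLL code: the class name is hypothetical, reopening after a
fixed number of writes is an assumption (the text says only "periodically"),
and the printed message is a stand-in for the one RLL actually prints.

```python
class DribbleFile:
    """Appends text to a file, closing and reopening it every few writes so
    that what has been written so far survives a crash."""

    def __init__(self, path, reopen_every=3):
        self.path = path
        self.reopen_every = reopen_every
        self.count = 0
        self.handle = open(path, "a")

    def write(self, text):
        self.handle.write(text)
        self.count += 1
        if self.count % self.reopen_every == 0:
            self.handle.close()                 # closing puts the text safely on disk
            self.handle = open(self.path, "a")  # reopen in append mode and carry on
            print("*** dribble file closed and reopened")  # hypothetical wording

    def close(self):
        self.handle.close()
```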

By the way, the file is named TRACE<x>.<mon><date>, where the value of
<x> begins as "", and then progresses from "A" to "B", and so on.
<mon> is the three letter abbreviation for the month, and <date> is 
the date's two digits.  So TRACEC.Jul24 would be the fourth dribble file
opened on July 24.
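
The naming scheme in this paragraph can be written out as a short function.
A Python sketch, with a hypothetical function name; it follows the numbering
given here (the unsuffixed file is the first of the day), uses the mixed-case
month of the TRACEC.Jul24 example, and treats anything past the suffix "Z" as
unspecified.

```python
import datetime

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def trace_name(n, day):
    """Name of the n-th dribble file (n = 1, 2, ...) opened on the date `day`."""
    if n < 1 or n > 27:
        raise ValueError("the scheme covers only the suffixes '' and A through Z")
    suffix = "" if n == 1 else chr(ord("A") + n - 2)   # 2 -> A, 3 -> B, 4 -> C
    return "TRACE{}.{}{:02d}".format(suffix, MONTHS[day.month - 1], day.day)


print(trace_name(4, datetime.date(1981, 7, 24)))
```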

CORLL provides the option of printing out messages, indicating that
some unit is now being read in (from an external file), or being bumped out.
Such information is printed on the trace file, stored as the value of
the variable, UP.TRACEFILE.  Setting this to T prints this data out
on the terminal -- which is useful only for debugging problems in CORLL.
The value NIL suppresses this printout altogether -- this is the preferred
setting.

The RLL environment has a whole slew of (to me at least) useful macros.
Some are very RLL dependent:
EU is a general purpose command to edit the following unit -- working quite
analogously to EP or EF.
Within the InterLisp editor, while editing a unit, there are several
additional commands:
Abort tells the editor to IGNORE all changes made to this unit -- ie this unit
should retain the set of slots and values it had before this edit began.
SimplePut (or SP) tells the editor to use the simple UA-PUTPROP function to
store the new, changed values on this unit -- and not the more time consuming
PutValue.  (Realize that it is that full PutValue function which does all the
KB maintenance -- such as updating inverse links.  These updates will NOT be
made when the user exits with SP -- therefore use this quick process sparingly,
when you are sure you will NOT miss these side effects.)
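
The difference between the two ways of storing a value can be illustrated with
a small model.  This Python sketch is not the actual PutValue or UA-PUTPROP
code: units are dictionaries, the function names are stand-ins, and the only
maintenance shown is the Isa/Examples inverse link mentioned in Section 1.

```python
INVERSE = {"Isa": "Examples", "Examples": "Isa"}   # from "the Inverse of Isa is Examples"


def simple_put(units, u, slot, values):
    """Quick store: just record the values, with no KB maintenance."""
    units.setdefault(u, {})[slot] = list(values)


def put_value(units, u, slot, values):
    """Full store: record the values, then bring the inverse links up to date."""
    old = units.get(u, {}).get(slot, [])
    simple_put(units, u, slot, values)
    inv = INVERSE.get(slot)
    if inv is None:
        return
    for v in old:                         # drop back-links that no longer hold
        if v not in values:
            back = units.get(v, {}).get(inv, [])
            if u in back:
                back.remove(u)
    for v in values:                      # add back-links for the new values
        back = units.setdefault(v, {}).setdefault(inv, [])
        if u not in back:
            back.append(u)
```

After a simple_put of Isa on some unit, the class it names will not list that
unit among its Examples; that missing back-link is the kind of side effect
given up by exiting with SP.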

SOS Current -- creates a new sysout, named Current.  The standard options
are here given to the RLL-SYSOUT function -- these are the value of the variable
?.

--- RLL Independent ---
The editor command "W" expands to "21 UP P" -- this puts you at the next
"window".  
DCL enters the desired Declaration statement, for the current LAMBDA, NLAMBDA
or PROG statement.
!EE evaluates the current expression, and then edits this value.
INIT initializes the variables which appear bound in a PROG statement.

	General comments
Any value stored in a unit's Descr slot runs the risk of one day being TEXed,
and printed out as part of a larger document. (Appendix ? of [Greiner&Lenat]
was formed that way.)  So feel free to throw in TEX commands there --
some useful, predefined macros include \Unit (foo) , \Slot (sl) and 
\UnitSlot (U:S).

RLL maintains a (weak version of a) user profile: a unit is devoted to
storing facts about each user.
The primary use of such information, as of now, is to store the user's
default preferences, which are used when a KB is opened or written.
At such times, RLL has several options for
actions to take.  For example, it may decide to preserve a copy of each KB
opened read/write; or to compress the file, as it is closed.
Each user can preserve his usual options on that unit.
The default ones are the ones I found useful.  
<<Here indicate what the values represent, and how to change them.>>

-----
Global variables:
UF.NETWORKS - this lists the KBs currently open.  Each name is full.
	[The ACCESS property of each name is BOTH if this hash file is readwrite]
UP.BUMPFLG - the value T here means units may be bumped.
UP.INCORELIST - lists the units whose most recent values are now "incore".
	See the CORLL document to figure out how to use these.

Section 3, Part A:
	<Read in file DEMO[RDG,DBL] - omitting meta stuff at its beginning.>

Section 3, Part B:
	Here use a real example - perhaps taking from Steve's stuff, or from
my stuff for Rand.

	Conclusion

RLL is still incomplete.  There are a variety of errors and omissions present
in this proto-system.  First, there are many features and facilities
which clearly should be added; I just haven't got around to adding them.
Second are features which I haven't thought of; but which I'd agree to
consider adding on a moment's thought.
Aside from these omissions, RLL has several bugs.
I've used a particular convention to tag some of the more obscure ones:
at various places of the code I have dropped Warning statements --
usually of the form "I never figured I'd get here. Now what?".
The Warning function will first print out the message, and then ask
whether or not to break.  The general advice, at these points, is to
type N, and go on.
In addition, please jot down a quick message to send to me,
noting which error-message was printed, and indicating
how to reach this place again.  Such examples will provide the test-bed
I would otherwise have to laboriously create or contrive, but which
are needed to help eradicate this bug.
... and, of course, there will simply be out-and-out bugs in RLL.
Do let me know about these as well.

Appendix A: Creating a new RLL system

Figure RLL will require about 1500 (TOPS20 size) pages --
about 500 for the CORLL system (which, of course, other users may use),
about 300 pages permanent storage for RLL stuff, and about 700 pages
for temporary storage.
As additional KBs are created and filled in, much more will be needed.

FTP from Rand-Ai the following files:

from <GREINER.CORLL> - essentially everything, (CORLL.EXE is unneeded,
and may not even work, depending on your version of LISP.EXE)

[XUTIL.*, XUTILM.*, PROP.*, CORLL.*, CORLLUTIL.*, CORLLINIT.*, XRECORD.*,
XHASH.*, ...]

[Also some files which must be present on <LISP> or <LISPUSERS> -- in particular,
HASH.COM and PERMSTATUS.COM]

From <GREINER.RLL>
UTIL.*, RLL.., RLL.COM, and the KB files
RLL.KB, SLOTS.KB, USERS.KB, LISPFNS.KB
(If there are more recent *.XKB files, use them.)

From <GREINER>
INIT.LISP
(and maybe EXTRA - depending on your version of LISP)
	you will probably want to augment your current INIT.LISP file to
include much of the stuff in here.
In particular, the XUTIL.COM and XUTILM.COM must be loaded in; and the
PUTPROP onto REC-DECL is essential (at least if you intend to run interpretedly;
and maybe other cases as well.)
The file names may have to be adjusted -- ie mutatis mutandis.
---
See if the characters for backquote and quote are ? and ? respectively.
If so, edit the XUTIL file - (actually XUTILCOMS) to add on your system
to its list of "known" hostnames.

Set BACKUPDIR (a variable defined in UTILVARS) to be the desired "temporary"
directory, on which to stash ephemeral junk.

You may wish to junk old TRACE*.* files there -- to avoid cluttering up
the overall "permanent" space.

Bibliography

[Greiner80] - "RLL-1: ..."
[Greiner & Lenat80a] - AAAI
[Greiner & Lenat80b] - Details of RLL
[Lenat, Hayes-Roth, & Waterman] - Cognitive Economy
[Smith] CORLL
[Genesereth, Greiner & Smith] - MRS